Abstract

We study the approximability of predicates on $k$ variables from a
domain $[q]$, and give a new sufficient condition for such predicates to be
approximation resistant under the Unique Games Conjecture. Specifically,
we show that a predicate $P$ is approximation resistant if there exists
a balanced pairwise independent distribution over $[q]^k$ whose support is
contained in the set of satisfying assignments to $P$.

Using constructions of pairwise independent distributions, this result
implies that

• For general $k \ge 3$ and $q \ge 2$, the Max $k$-CSP$_q$ problem is UG-hard
to approximate within $q^{\lceil \log_2 (k+1) \rceil - k} + \epsilon$.

• For $k \ge 3$ and $q$ a prime power, the hardness ratio is improved to
$kq(q-1)/q^k + \epsilon$.

• For the special case of $q = 2$, i.e., boolean variables, we can sharpen
this bound to $(k + O(k^{0.525}))/2^k + \epsilon$, improving upon the best
previous bound of $2k/2^k + \epsilon$ (Samorodnitsky and Trevisan, STOC'06)
by essentially a factor 2.

• Finally, for $q = 2$, assuming that the famous Hadamard Conjecture
is true, this can be improved even further, and the $O(k^{0.525})$ term
can be replaced by the constant 4.



1 Introduction 



In the Max $k$-CSP problem, we are given a set of constraints over a set of
boolean variables, each constraint being a boolean function acting on at most $k$
of the variables. The objective is to find an assignment to the variables satisfying
as many of the constraints as possible. This problem is NP-hard for any $k \ge 2$,
and as a consequence, a lot of research has been focused on studying how well
the problem can be approximated. We say that a (randomized) algorithm has
approximation ratio $\alpha$ if, for all instances, the algorithm is guaranteed to find
an assignment which (in expectation) satisfies at least $\alpha \cdot \mathrm{Opt}$ of the constraints,
where $\mathrm{Opt}$ is the maximum number of simultaneously satisfied constraints, over
any assignment.

A particularly simple approximation algorithm is the algorithm which simply
picks a random assignment to the variables. This algorithm has a ratio of $1/2^k$.
It was first improved by Trevisan [22] who gave an algorithm with ratio $2/2^k$ for
Max $k$-CSP. Recently, Hast [8] gave an algorithm with ratio $\Omega(k/(2^k \log k))$,
which was subsequently improved by Charikar et al. [5] who gave an algorithm
with approximation ratio $c \cdot k/2^k$, where $c > 0.44$ is an absolute constant.

The PCP Theorem implies that the Max $k$-CSP problem is NP-hard to
approximate within $1/c^k$ for some constant $c > 1$. Samorodnitsky and
Trevisan [20] improved this hardness to $2^{2\sqrt{k}}/2^k$, and this was further improved to
$2^{\sqrt{2k}}/2^k$ by Engebretsen and Holmerin [7]. Finally, Samorodnitsky and Trevisan
[21] proved that, if the Unique Games Conjecture [12] is true, then the Max
$k$-CSP problem is hard to approximate within $2k/2^k$. To be more precise, the
hardness they obtained was $2^{\lceil \log_2 (k+1) \rceil}/2^k$, which is $(k+1)/2^k$ for $k = 2^r - 1$, but
can be as large as $2k/2^k$ for general $k$. Thus, the current gap between hardness
and approximability is a small constant factor of $2/0.44$.

For a predicate $P : \{0,1\}^k \to \{0,1\}$, the Max CSP($P$) problem is the
special case of Max $k$-CSP in which all constraints are of the form $P(l_1, \ldots, l_k)$,
where each literal $l_i$ is either a variable or a negated variable. For this problem,
the random assignment algorithm achieves a ratio of $m/2^k$, where $m$ is the
number of satisfying assignments of $P$. Surprisingly, it turns out that for certain
choices of $P$, this is the best possible algorithm. In a celebrated result, Håstad
[10] showed that for $P(x_1, x_2, x_3) = x_1 \oplus x_2 \oplus x_3$, the Max CSP($P$) problem is
hard to approximate within $1/2 + \epsilon$.
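
As a small illustration of the ratio $m/2^k$, the following Python sketch (not part of the original text; the function names are chosen here for illustration) enumerates all inputs of a predicate and computes the fraction that it accepts. Since literals may be negated, this fraction is the expected fraction of constraints satisfied by a uniformly random assignment.

```python
from fractions import Fraction
from itertools import product


def random_assignment_ratio(predicate, k):
    """Return m / 2^k, where m is the number of accepting inputs of `predicate`."""
    inputs = list(product((0, 1), repeat=k))
    accepting = sum(1 for x in inputs if predicate(x))
    return Fraction(accepting, len(inputs))


def xor3(x):
    """The 3-XOR predicate: accepts when x1 + x2 + x3 = 1 (mod 2)."""
    return (x[0] ^ x[1] ^ x[2]) == 1


# Four of the eight inputs have odd parity, so the ratio is 1/2.
print(random_assignment_ratio(xor3, 3))
```

For 3-XOR the computed ratio coincides with the hardness threshold $1/2$ quoted above.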

Predicates $P$ for which it is hard to approximate the Max CSP($P$) problem
better than a random assignment are called approximation resistant. A slightly
stronger notion is that of hereditary approximation resistance: a predicate $P$ is
hereditary approximation resistant if all predicates implied by $P$ are approximation
resistant. A natural and important question is to understand the structure
of approximation resistance. For $k = 2$ and $k = 3$, this question is resolved:
predicates on 2 variables are never approximation resistant, and a predicate on
3 variables is approximation resistant if and only if it is implied by an XOR of
the three variables [10, 23]. For $k = 4$, Hast [9] managed to classify most of the
predicates with respect to approximation resistance, but for this case there
does not appear to be as nice a characterization as there is in the case $k = 3$. It
turns out that, assuming the Unique Games Conjecture, most predicates are in
fact hereditary approximation resistant: as $k$ grows, the fraction of such predicates
tends to 1 [11]. Thus, instead of attempting to understand the seemingly
complicated structure of approximation resistant predicates, one might try to
understand the possibly easier structure of hereditary approximation resistant
predicates, as these constitute the vast majority of all predicates.

A natural approach for obtaining strong inapproximability for the Max $k$-CSP
problem is to search for approximation resistant predicates with very few
accepting inputs. This is indeed how all mentioned hardness results for Max
$k$-CSP come about (except the one implied by the PCP Theorem).

It is natural to generalize the Max $k$-CSP problem to variables over a
domain of size $q$, rather than just boolean variables. Without loss of generality
we may assume that the domain is $[q]$. We call this the Max $k$-CSP$_q$ problem.
For Max $k$-CSP$_q$, the random assignment gives a $1/q^k$-approximation, and any
$f(k)$-approximation algorithm for the Max $k$-CSP problem gives an $f(k \lceil \log_2 q \rceil)$-approximation
algorithm for the Max $k$-CSP$_q$ problem. Thus, Charikar et al.'s
algorithm gives a $0.44 k \log_2 q / q^k$-approximation in the case that $q$ is a power of
2. The best previous inapproximability for the Max $k$-CSP$_q$ problem is due
to Engebretsen [6], who showed that the problem is NP-hard to approximate
within $q^{O(\sqrt{k})}/q^k$.

Similarly to $q = 2$, we can define the Max CSP($P$) problem for $P : [q]^k \to \{0,1\}$.
Here, there are several natural ways of generalizing the notion of a literal.
One possible definition is to say that a literal $l$ is of the form $\pi(x_i)$, for some
variable $x_i$ and permutation $\pi : [q] \to [q]$. A stricter definition is to say that a
literal is of the form $x_i + a$, where, again, $x_i$ is a variable, and $a \in [q]$ is some
constant. In this paper, we use the second, stricter, definition. As this is a
special case of the first definition, our hardness results apply also to the first
definition.

1.1 Our contributions 

Our main result is the following: 

Theorem 1.1. Let $P : [q]^k \to \{0,1\}$ be a $k$-ary predicate over $[q]$, and let $\mu$ be
a distribution over $[q]^k$ such that

$$\Pr_{x \in ([q]^k, \mu)}[P(x)] = 1$$

and for all $1 \le i \ne j \le k$ and all $a, b \in [q]$, it holds that

$$\Pr_{x \in ([q]^k, \mu)}[x_i = a, x_j = b] = 1/q^2.$$

Then, for any $\epsilon > 0$, the UGC implies that the Max CSP($P$) problem is NP-hard
to approximate within

$$\frac{|P^{-1}(1)|}{q^k} + \epsilon,$$

i.e., $P$ is hereditary approximation resistant.
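
The two hypotheses of Theorem 1.1 can be checked mechanically for small examples. The Python sketch below is only an illustration (the helper names and the choice of example are ours, not the paper's): it verifies that the uniform distribution on even-parity strings in $\{0,1\}^3$ satisfies the pairwise condition and is supported on the satisfying assignments of the even-parity predicate.

```python
from fractions import Fraction
from itertools import combinations, product


def is_balanced_pairwise_independent(mu, q):
    """Check Pr[x_i = a, x_j = b] = 1/q^2 for all i != j and all a, b.

    `mu` maps k-tuples over range(q) to exact probabilities (Fractions).
    """
    k = len(next(iter(mu)))
    target = Fraction(1, q * q)
    for i, j in combinations(range(k), 2):
        for a, b in product(range(q), repeat=2):
            mass = sum(p for x, p in mu.items() if x[i] == a and x[j] == b)
            if mass != target:
                return False
    return True


def support_inside(mu, predicate):
    """True if every point of positive mass satisfies `predicate`."""
    return all(predicate(x) for x, p in mu.items() if p > 0)


def even_parity(x):
    return sum(x) % 2 == 0


# Illustrative example: uniform distribution on the even-parity strings of length 3.
support = [x for x in product(range(2), repeat=3) if even_parity(x)]
mu = {x: Fraction(1, len(support)) for x in support}

print(is_balanced_pairwise_independent(mu, 2), support_inside(mu, even_parity))
print(Fraction(len(support), 2 ** 3))  # the ratio |P^{-1}(1)| / q^k for this predicate
```

In this example the ratio $|P^{-1}(1)|/q^k$ equals $1/2$, in line with the hardness of 3-XOR mentioned in the introduction.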

Using constructions of pairwise independent distributions, we obtain the 
following corollaries: 

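Theorem 1.2. For any $k \ge 3$, $q \ge 2$, and $\epsilon > 0$, it is UG-hard to approximate
the Max $k$-CSP$_q$ problem within

$$\frac{q^{\lceil \log_2 (k+1) \rceil}}{q^k} + \epsilon \le \frac{q \cdot k^{\log_2 q}}{q^k} + \epsilon.$$

In the special case that $k = 2^r - 1$ for some $r$, the hardness ratio improves to

$$\frac{(k+1)^{\log_2 q}}{q^k} + \epsilon.$$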
This already constitutes a significant improvement upon the $q^{O(\sqrt{k})}/q^k$-hardness
of Engebretsen, and in the case that $q$ is a prime power we can improve
this even further.

Theorem 1.3. For any $k \ge 3$, $q = p^e$ for some prime $p$, and $\epsilon > 0$, it is
UG-hard to approximate the Max $k$-CSP$_q$ problem within

$$\frac{k(q-1)q}{q^k} + \epsilon.$$

In the special case that $k = (q^r - 1)/(q - 1)$ for some $r$, the hardness ratio
improves to

$$\frac{k(q-1) + 1}{q^k} + \epsilon \le \frac{kq}{q^k} + \epsilon.$$

Neither of these two theorems improves upon the results of [21] for the case
of $q = 2$. However, the following theorem does.

Theorem 1.4. For any $k \ge 3$ and $\epsilon > 0$, it is UG-hard to approximate the
Max $k$-CSP problem within

$$\frac{k + O(k^{0.525})}{2^k} + \epsilon.$$

If the Hadamard Conjecture is true, it is UG-hard to approximate the Max
$k$-CSP problem within

$$\frac{4 \lceil (k+1)/4 \rceil}{2^k} + \epsilon \le \frac{k + 4}{2^k} + \epsilon.$$

Thus, we improve the hardness of [21] by essentially a factor 2, decreasing 
the gap to the best algorithm from roughly 2/0.44 to roughly 1/0.44. 



1.2 Related work 

It is interesting to compare our results to the results of Samorodnitsky and
Trevisan [21]. Recall that using the Gowers norm, [21] prove that the Max $k$-CSP
problem has a hardness factor of $2^{\lceil \log_2 (k+1) \rceil}/2^k$, which is $(k+1)/2^k$ for $k = 2^r - 1$,
but can be as large as $2k/2^k$ for general $k$.

Our proof uses the same version of the UGC, but the analysis is more direct
and more general. The proof of [21] requires us to work specifically with a
linearity hyper-graph test for the long codes. For this test, the success probability
is shown to be closely related to the Gowers inner product of the long codes. In
particular, in the soundness analysis it is shown that if the value of this test is
too large, it follows that the Gowers norm is larger than for "random functions".
From this it is shown that at least two of the functions have large influences,
which in turn allows us to obtain a good solution for the UGC.

Our construction, on the other hand, allows any pairwise independent distribution
to define a long-code test. Using [16] we show that if a collection of supposed long codes
does better than random for this long-code test, then at least two of them have
large influences.

Our proof has a number of advantages: first, it applies to any pairwise
independent distribution. This should be compared to [21], which requires us to work
specifically with the hyper-graph linearity test. In particular, our results allow
us to obtain hardness results for Max CSP($P$) for a wide range of $P$'s. The
results are general enough to accommodate any domain $[q]$ (it is not clear if the
results of [21] extend to larger domains), and we are also able to obtain a better
hardness factor for most values of $k$ even in the $q = 2$ case.

Also, our proof uses bounds on expectations of products under certain types
of correlation, putting it in the same general framework as many other UGC-based
hardness results, in particular those for 2-CSPs [13, 14, 2, 3, 18].

Finally, our proof gives parametrized hardness in the following sense. We
give a family of hardness assumptions, called the $(t,k)$-UGC. All of these
assumptions follow from the UGC, and in particular the case $t = 2$ is known to
be equivalent to the UGC. However, the $(t,k)$-UGC assumption is weaker for
larger values of $t$. For each value of $t$ our results imply a different hardness of
approximation factor. Specifically, if the $(t,k)$-UGC is true for some $t \ge 3$, then
the Max $k$-CSP problem is NP-hard to approximate within $O(k^{\lceil t/2 \rceil - 1}/2^k)$.
Thus, even the $(4,k)$-UGC gives a hardness of $O(k/2^k)$, and for $t < \sqrt{k}/\log k$,
the $(t,k)$-UGC gives a hardness better than the best unconditional result known
[7].

2 Definitions 
2.1 Unique Games 

We use the following formulation of the Unique Label Cover Problem: given is a
$k$-uniform hypergraph, where for each edge $(v_1, \ldots, v_k)$ there are $k$ permutations
$\pi_1, \ldots, \pi_k$ on $[L]$. We say that an edge $(v_1, \ldots, v_k)$ with permutations $\pi_1, \ldots, \pi_k$
is $t$-wise satisfied by a labelling $\ell : V \to [L]$ if there are $i_1 < i_2 < \cdots < i_t$
such that $\pi_{i_1}(\ell(v_{i_1})) = \pi_{i_2}(\ell(v_{i_2})) = \ldots = \pi_{i_t}(\ell(v_{i_t}))$. We say that an edge is
completely satisfied by a labelling if it is $k$-wise satisfied.

We denote by $\mathrm{Opt}_t(X) \in [0,1]$ the maximum fraction of $t$-wise satisfied
edges, over any labelling. Note that $\mathrm{Opt}_{t+1}(X) \le \mathrm{Opt}_t(X)$.
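
To make the notion of $t$-wise satisfaction concrete, here is a minimal Python sketch (an illustration only; the data layout, with permutations stored as tuples and labellings as dictionaries, is an assumption made here). It computes the largest $t$ for which a given edge is $t$-wise satisfied by a given labelling.

```python
from collections import Counter


def agreement(edge, perms, labelling):
    """Largest t such that the edge is t-wise satisfied by `labelling`.

    `edge` is a tuple of vertices, `perms[j]` is the permutation pi_j given as a
    tuple with perms[j][l] = pi_j(l), and `labelling` maps vertices to labels.
    """
    images = [pi[labelling[v]] for v, pi in zip(edge, perms)]
    return max(Counter(images).values())


def is_t_wise_satisfied(edge, perms, labelling, t):
    return agreement(edge, perms, labelling) >= t


# Hypothetical toy edge with L = 3 labels.
edge = ("a", "b", "c")
perms = [(0, 1, 2), (1, 0, 2), (0, 1, 2)]
labelling = {"a": 0, "b": 1, "c": 2}

# pi_1(0) = 0, pi_2(1) = 0, pi_3(2) = 2: two of the three images agree.
print(agreement(edge, perms, labelling))
```

Because a $(t+1)$-wise satisfied edge is also $t$-wise satisfied, the function `agreement` directly reflects the monotonicity $\mathrm{Opt}_{t+1}(X) \le \mathrm{Opt}_t(X)$.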

The following conjecture is known to follow from the Unique Games Conjec- 
ture (see details below). 

Conjecture 2.1. For any $2 \le t \le k$, and $\delta > 0$, there exists an $L > 0$ such
that it is NP-hard to distinguish between $k$-ary Unique Label Cover instances $X$
with label set $[L]$ with $\mathrm{Opt}_k(X) \ge 1 - \delta$, and $\mathrm{Opt}_t(X) \le \delta$.

For particular values of $t$ and $k$ we will refer to the corresponding special
case of the above conjecture as the $(t,k)$-Unique Games Conjecture (or the
$(t,k)$-UGC).

Khot's original formulation of the Unique Games Conjecture [12] is then
exactly the $(2,2)$-UGC, and Khot and Regev [15] proved that this conjecture
is equivalent to the $(2,k)$-UGC for all $k$, which is what Samorodnitsky and
Trevisan [21] used to obtain hardness for Max $k$-CSP.

In this paper, we mainly use the $(3,k)$-UGC to obtain our hardness results.
Clearly, since $\mathrm{Opt}_{t+1}(X) \le \mathrm{Opt}_t(X)$, the $(t,k)$-UGC implies the $(t+1,k)$-UGC,
so our assumption is implied by the Unique Games Conjecture. But whether
the converse holds, or whether there is hope of proving this conjecture (or, say,
the $(k,k)$-UGC for large $k$) without proving the Unique Games Conjecture, is
not clear, and should be an interesting direction for future research.

2.2 Influences 

It is well known (see e.g. [13]) that each function $f : [q]^n \to \mathbb{R}$ admits a unique
Efron-Stein decomposition $f = \sum_{S \subseteq [n]} f_S$, where

• The function $f_S$ depends on $x_S = (x_i : i \in S)$ only.

• For every $S' \not\supseteq S$, and every $y_{S'} \in [q]^{S'}$, it holds that

$$\mathrm{E}[f_S(x_S) \mid x_{S'} = y_{S'}] = 0.$$

For $m \le n$ we write $f^{\le m} = \sum_{S : |S| \le m} f_S$ for the $m$-degree expansion of $f$. We
now define the influence of the $i$th coordinate on $f$, denoted by $\mathrm{Inf}_i(f)$, by

$$\mathrm{Inf}_i(f) = \mathrm{E}_x[\mathrm{Var}_{x_i}[f(x)]]. \qquad (1)$$

We define the $m$-degree influence of the $i$th coordinate on $f$, denoted by $\mathrm{Inf}_i^{\le m}(f)$,
by $\mathrm{Inf}_i(f^{\le m})$.

Recall that the influence $\mathrm{Inf}_i(f)$ measures how much the function $f$ depends
on the $i$th variable, while the low-degree influence $\mathrm{Inf}_i^{\le m}(f)$ measures this for
the low-degree part of the expansion of $f$. The latter quantity is closely related to the
influence of $f$ on "slightly noisy inputs".

An important property of low-degree influences is that

$$\sum_{i=1}^{n} \mathrm{Inf}_i^{\le m}(f) \le m \, \mathrm{Var}[f],$$

implying that the number of coordinates with large low-degree influence must
be small. In particular, if $f : [q]^n \to [0,1]$, then the number of coordinates
with low-degree influence at least $\tau$ is at most $m/\tau$.
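
Definition (1) can be evaluated by brute force for small $q$ and $n$. The following Python sketch is an illustration written for this purpose (the function name and the example are ours); it computes the full influence $\mathrm{Inf}_i(f)$ under the uniform distribution. It does not compute low-degree influences, which would additionally require the Efron-Stein decomposition.

```python
from fractions import Fraction
from itertools import product


def influence(f, q, n, i):
    """Inf_i(f) = E_x[Var_{x_i}[f(x)]] for f on range(q)^n and uniform x."""
    total = Fraction(0)
    for rest in product(range(q), repeat=n - 1):
        # Values of f as coordinate i varies and the other coordinates stay fixed.
        values = [Fraction(f(rest[:i] + (a,) + rest[i:])) for a in range(q)]
        mean = sum(values) / q
        total += sum((v - mean) ** 2 for v in values) / q
    return total / q ** (n - 1)


def dictator_indicator(x):
    """Indicator that the first coordinate equals 0 (depends on x[0] only)."""
    return 1 if x[0] == 0 else 0


print(influence(dictator_indicator, 3, 2, 0))  # variance of a 1/3-biased bit: 2/9
print(influence(dictator_indicator, 3, 2, 1))  # the second coordinate is irrelevant: 0
```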

2.3 Correlated Probability Spaces 

We will be interested in probability distributions supported in $P^{-1}(1) \subseteq [q]^k$.
It will be useful to follow [16] and view $[q]^k$ with such a probability measure
as a collection of $k$ correlated spaces corresponding to the $k$ coordinates. We
proceed with formal definitions of two and $k$ correlated spaces.

Definition 2.2. Let $(\Omega, \mu)$ be a probability space over a finite product space
$\Omega = \Omega_1 \times \Omega_2$. The correlation between $\Omega_1$ and $\Omega_2$ (with respect to $\mu$) is

$$\rho(\Omega_1, \Omega_2; \mu) = \sup \{ \mathrm{Cov}[f_1(x_1), f_2(x_2)] : f_i : \Omega_i \to \mathbb{R}, \mathrm{Var}[f_i(x_i)] = 1 \},$$

where $(x_1, x_2)$ is drawn from $(\Omega, \mu)$.

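Definition 2.3. Let $(\Omega, \mu)$ be a probability space over a finite product space
$\Omega = \prod_{i=1}^{k} \Omega_i$, and let $\Omega_S = \prod_{i \in S} \Omega_i$. The correlation of $\Omega_1, \ldots, \Omega_k$ (with respect to
$\mu$) is

$$\rho(\Omega_1, \ldots, \Omega_k; \mu) = \max_{1 \le i \le k-1} \rho(\Omega_{\{1,\ldots,i\}}, \Omega_{\{i+1,\ldots,k\}}; \mu).$$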
Of particular interest to us is the case where correlated spaces are defined
by a measure that is $t$-wise independent.



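Definition 2.4. Let $(\Omega, \mu)$ be a probability space over a product space
$\Omega = \prod_{i=1}^{k} \Omega_i$. We say that $\mu$ is $t$-wise independent if, for any choice of
$i_1 < i_2 < \cdots < i_t$ and $b_1, \ldots, b_t$ with $b_j \in \Omega_{i_j}$, we have that

$$\Pr_{w \in (\Omega, \mu)}[w_{i_1} = b_1, \ldots, w_{i_t} = b_t] = \prod_{j=1}^{t} \Pr_{w \in (\Omega, \mu)}[w_{i_j} = b_j].$$

We say that $(\Omega, \mu)$ is balanced if for every $i \in [k]$, $b \in \Omega_i$, we have that
$\Pr_{w \in (\Omega, \mu)}[w_i = b] = 1/|\Omega_i|$.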

The following theorem considers low influence functions that act on corre- 
lated spaces where the correlation is given by a t-wise independent probability 
measure for t > 2. It shows that in this case, the functions have almost the same 
distribution as if they were completely independent. Moreover, the result holds 
even if some of the functions have large influences as long as in each coordinate 
not more than t functions have large influences. 

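Theorem 2.5 ([16], Theorem 6.6 and Lemma 6.9). Let $(\Omega, \mu)$ be a finite
probability space over $\Omega = \prod_{i=1}^{k} \Omega_i$ with the following properties:

(a) $\mu$ is $t$-wise independent.

(b) For all $i \in [k]$ and $b_i \in \Omega_i$, $\mu_i(b_i) > 0$, where $\mu_i$ denotes the marginal of $\mu$ on $\Omega_i$.

(c) $\rho(\Omega_1, \ldots, \Omega_k; \mu) < 1$.

Then for all $\epsilon > 0$ there exist $\tau > 0$ and $d > 0$ such that the following holds.
Let $f_1, \ldots, f_k$ be functions $f_i : \Omega_i^n \to [0,1]$ satisfying that, for all $1 \le j \le n$,

$$|\{ i : \mathrm{Inf}_j^{\le d}(f_i) \ge \tau \}| \le t.$$

Then

$$\left| \mathrm{E}_{w_1,\ldots,w_n}\left[ \prod_{i=1}^{k} f_i(w_{1,i}, \ldots, w_{n,i}) \right] - \prod_{i=1}^{k} \mathrm{E}_{w_1,\ldots,w_n}[f_i(w_{1,i}, \ldots, w_{n,i})] \right| \le \epsilon,$$

where $w_1, \ldots, w_n$ are drawn independently from $(\Omega, \mu)$, and $w_{i,j} \in \Omega_j$ denotes
the $j$th coordinate of $w_i$.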

Note that a sufficient condition for (c) to hold in the above theorem is that
for all $w \in \Omega$, $\mu(w) > 0$.

Roughly speaking, the basic idea behind the theorem and its proof is that 
low influence functions cannot detect dependencies of high order - in particular 
if the underlying measure is pairwise independent, then low influence functions 
of different coordinates are essentially independent. 
3 Main theorem

In this section, we prove our main theorem. Note that it is a generalization of
Theorem 1.1.

Theorem 3.1. Let $P : [q]^k \to \{0,1\}$ be a $k$-ary predicate over a (finite) domain
of size $q$, and let $\mu$ be a balanced $t$-wise independent distribution over $[q]^k$ such
that $\Pr_{x \in ([q]^k, \mu)}[P(x)] > 0$. Then, for any $\epsilon > 0$, the $(t+1,k)$-UGC implies that
the Max CSP($P$) problem is NP-hard to approximate within

$$\frac{|P^{-1}(1)|}{q^k \cdot \Pr_{x \in ([q]^k, \mu)}[P(x)]} + \epsilon.$$

In particular, note that if $\Pr_{x \in ([q]^k, \mu)}[P(x)] = 1$, i.e., if the support of $\mu$ is
entirely contained in the set of satisfying assignments to $P$, then $P$ is approximation
resistant. It is also hereditary approximation resistant, since the support
of $\mu$ will still be contained in $P^{-1}(1)$ when we add more satisfying assignments
to $P$.

Reduction. Given a $k$-ary Unique Label Cover instance $X$, the prover writes
down the table of a function $f_v : [q]^L \to [q]$ for each $v$, which is supposed to be
the long code of the label of the vertex $v$. Furthermore, we will assume that $f_v$
is folded, i.e., that for every $x \in [q]^L$ and $a \in [q]$, we have $f_v(x + (a, \ldots, a)) =
f_v(x) + a$ (where the definition of "+" in $[q]$ is arbitrary as long as $([q], +)$ is
an Abelian group). When reading the value of $f_v(x_1, \ldots, x_L)$, the verifier can
enforce this condition by instead querying $f_v(x_1 - x_1, x_2 - x_1, \ldots, x_L - x_1)$ and
adding $x_1$ to the result. Let $\eta > 0$ be a parameter, the value of which will be
determined later, and define a probability distribution $\mu'$ on $[q]^k$ by

$$\mu'(w) = (1 - \eta) \cdot \mu(w) + \eta \cdot \mu_U(w),$$

where $\mu_U$ is the uniform distribution on $[q]^k$, i.e., $\mu_U(w) = 1/q^k$. Given a proof
$\Sigma = \{f_v\}_{v \in V}$ of supposed long codes for a good labelling of $X$, the verifier
checks $\Sigma$ as follows.

Algorithm 1: The verifier V

V(X, $\Sigma = \{f_v\}_{v \in V}$)

(1) Pick a random edge $e = (v_1, \ldots, v_k)$ with permutations
$\pi_1, \ldots, \pi_k$.

(2) For each $i \in [L]$, draw $w_i$ randomly from $([q]^k, \mu')$.

(3) For each $j \in [k]$, let $x_j = w_{1,j} \ldots w_{L,j}$, and let $b_j = f_{v_j}(\pi_j(x_j))$.

(4) Accept if $P(b_1, \ldots, b_k)$.
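
The following Python sketch simulates steps (2) to (4) of the verifier on a single fixed edge, with honest long codes as the proof. It is only an illustration under several assumptions that are not taken from the text: $\mu$ is taken to be uniform on a given support, a permutation acts on a string by $(\pi(x))_l = x_{\pi(l)}$, folding is not enforced explicitly (dictators are already folded), and $\eta = 0$ is used in the example so that acceptance is deterministic.

```python
import random
from itertools import product


def long_code(label):
    """Dictator f(x) = x_label on [q]^L, i.e. the long code of `label`."""
    return lambda x: x[label]


def sample_mu_prime(mu_support, q, k, eta, rng):
    """Draw from mu' = (1 - eta) * mu + eta * uniform, with mu uniform on mu_support."""
    if rng.random() < eta:
        return tuple(rng.randrange(q) for _ in range(k))
    return rng.choice(mu_support)


def verifier_accepts(predicate, tables, perms, mu_support, q, L, eta, rng):
    """One run of steps (2)-(4) of the test on a fixed edge."""
    k = len(tables)
    w = [sample_mu_prime(mu_support, q, k, eta, rng) for _ in range(L)]
    b = []
    for j in range(k):
        x_j = tuple(w[i][j] for i in range(L))
        # Assumed convention: (pi_j(x))_l = x_{pi_j(l)}.
        permuted = tuple(x_j[perms[j][l]] for l in range(L))
        b.append(tables[j](permuted))
    return predicate(tuple(b))


def even_parity(x):
    return sum(x) % 2 == 0


# Hypothetical toy edge: k = 3, L = 3, q = 2, and mu uniform on even-parity strings.
mu_support = [x for x in product(range(2), repeat=3) if even_parity(x)]
labels = [0, 1, 2]                         # labels of v_1, v_2, v_3
perms = [(2, 1, 0), (0, 2, 1), (0, 1, 2)]  # each pi_j maps the label of v_j to 2
tables = [long_code(label) for label in labels]
rng = random.Random(0)
runs = [verifier_accepts(even_parity, tables, perms, mu_support, 2, 3, 0.0, rng)
        for _ in range(200)]
print(all(runs))
```

On this completely satisfied edge every answer $b_j$ equals the $j$th coordinate of the same sample $w_i$, so with $\eta = 0$ the test accepts whenever the support of $\mu$ lies inside $P^{-1}(1)$.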



Lemma 3.2 (Completeness). For any $\delta$, if $\mathrm{Opt}_k(X) \ge 1 - \delta$, then there is a
proof $\Sigma$ such that

$$\Pr[V(X, \Sigma) \text{ accepts}] \ge (1 - \delta)(1 - \eta) \Pr_{w \in ([q]^k, \mu)}[P(w)].$$

Proof. Take a labelling $\ell$ for $X$ such that a fraction $\ge 1 - \delta$ of the edges are
$k$-wise satisfied, and let $f_v : [q]^L \to [q]$ be the long code of the label $\ell(v)$ of
vertex $v$.

Let $(v_1, \ldots, v_k)$ be an edge that is $k$-wise satisfied by $\ell$. Then $f_{v_1} \circ \pi_1 =
f_{v_2} \circ \pi_2 = \ldots = f_{v_k} \circ \pi_k$, each being the long code of $i := \pi_1(\ell(v_1))$. The probability
that V accepts is then exactly the probability that $P(w_i)$ is true, which, since $w_i$
is drawn from $([q]^k, \mu)$ with probability $1 - \eta$, is at least $(1 - \eta) \Pr_{w \in ([q]^k, \mu)}[P(w)]$.

The probability that the edge $e$ chosen by the verifier in step 1 is satisfied
by $\ell$ is at least $1 - \delta$, and so we end up with the desired inequality. □

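Lemma 3.3 (Soundness). For any $\epsilon > 0$, $\eta > 0$, there is a constant $\delta :=
\delta(\epsilon, \eta, t, k, q) > 0$, such that if $\mathrm{Opt}_{t+1}(X) < \delta$, then for any proof $\Sigma$, we have

$$\Pr[V(X, \Sigma) \text{ accepts}] \le \frac{|P^{-1}(1)|}{q^k} + \epsilon.$$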



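Proof. Assume that

$$\Pr[V(X, \Sigma) \text{ accepts}] > \frac{|P^{-1}(1)|}{q^k} + \epsilon. \qquad (2)$$

We need to prove that this implies that there is a $\delta := \delta(\epsilon, \eta, t, k, q) > 0$ such
that $\mathrm{Opt}_{t+1}(X) \ge \delta$.

Equation (2) implies that for a fraction of at least $\epsilon/2$ of the edges $e$, the
probability that $V(X, \Sigma)$ accepts when choosing $e$ is at least $|P^{-1}(1)|/q^k + \epsilon/2$.

Let $e = (v_1, \ldots, v_k)$ with permutations $\pi_1, \ldots, \pi_k$ be such a "good" edge. For
$v \in V$ and $a \in [q]$, define $g_{v,a} : [q]^L \to \{0,1\}$ by

$$g_{v,a}(x) = \begin{cases} 1 & \text{if } f_v(x) = a \\ 0 & \text{otherwise} \end{cases}$$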



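The probability that V accepts when choosing $e$ is then exactly

$$\sum_{x \in P^{-1}(1)} \mathrm{E}_{w_1,\ldots,w_L}\left[ \prod_{i=1}^{k} g_{v_i,x_i}(\pi_i(w_{1,i}, \ldots, w_{L,i})) \right],$$

which, by the choice of $e$, is greater than $|P^{-1}(1)|/q^k + \epsilon/2$. This implies that
there is some $x \in P^{-1}(1)$ such that

$$\mathrm{E}_{w_1,\ldots,w_L}\left[ \prod_{i=1}^{k} g_{v_i,x_i}(\pi_i(w_{1,i}, \ldots, w_{L,i})) \right] > 1/q^k + \epsilon' = \prod_{i=1}^{k} \mathrm{E}_{w_1,\ldots,w_L}[g_{v_i,x_i}(\pi_i(w_{1,i}, \ldots, w_{L,i}))] + \epsilon',$$

where $\epsilon' = \epsilon/2/|P^{-1}(1)|$ and the last equality uses that, because $f_{v_i}$ is folded
and $\mu'$ is balanced, we have $\mathrm{E}_{w_1,\ldots,w_L}[g_{v_i,x_i}(\pi_i(w_{1,i}, \ldots, w_{L,i}))] = 1/q$.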

Note that because both $\mu$ and $\mu_U$ are $t$-wise independent, $\mu'$ is also $t$-wise
independent. Also, we have that for each $w \in [q]^k$, $\mu'(w) \ge \eta/q^k > 0$, which
implies both conditions (b) and (c) of Theorem 2.5. Then, the contrapositive
formulation of Theorem 2.5 implies that there is an $i \in [L]$ and at least $t + 1$
indices $J \subseteq [k]$ such that $\mathrm{Inf}_i^{\le d}(g_{v_j,x_j} \circ \pi_j) = \mathrm{Inf}_{\pi_j^{-1}(i)}^{\le d}(g_{v_j,x_j}) \ge \tau$ for all $j \in J$,
where $\tau$ and $d$ are functions of $\epsilon$, $\eta$, $t$, $k$, and $q$.

The process of constructing a good labelling of $X$ from this point is standard.
For completeness, we give a proof in the appendix. Specifically, Lemma A.1
gives that $\mathrm{Opt}_{t+1}(X) \ge \frac{\epsilon}{2} \left( \frac{\tau}{d \cdot q} \right)^{t+1}$, which is a function of $\epsilon$, $\eta$, $t$, $k$, and $q$, as
desired. □

It is now straightforward to prove Theorem 3.1.

Proof of Theorem 3.1. Let $c = \Pr_{x \in ([q]^k, \mu)}[P(x)]$, $s = |P^{-1}(1)|/q^k$ and $\eta =
\min(1/4, \frac{\epsilon c}{8 s})$. Note that since the statement of the Theorem requires $c > 0$
we also have $s > 0$ and $\eta > 0$. Assume that the $(t+1,k)$-UGC is true, and
pick $L$ large enough so that it is NP-hard to distinguish between $k$-ary Unique
Label Cover instances $X$ with $\mathrm{Opt}_{t+1}(X) \le \delta$ and $\mathrm{Opt}_k(X) \ge 1 - \delta$, where
$\delta = \min(\eta, \delta(\epsilon c/4, \eta, t, k, q))$, where $\delta(\ldots)$ is the function from Lemma 3.3. By
Lemmas 3.2 and 3.3, we then get that it is NP-hard to distinguish between
Max CSP($P$) instances with $\mathrm{Opt} \ge (1 - \delta)(1 - \eta)c \ge (1 - 2\eta)c$ and $\mathrm{Opt} \le
s + \epsilon c/4$. In other words, it is NP-hard to approximate the Max CSP($P$)
problem within a factor

$$\frac{s + \epsilon c/4}{(1 - 2\eta)c} \le \frac{(1 + 4\eta)s}{c} + (1 + 4\eta)\epsilon/4 \le s/c + \epsilon.$$

□



4 Inapproximability for Max $k$-CSP$_q$

As a simple corollary to Theorem 3.1, we have:

Corollary 4.1. Let $t \ge 2$ and let $\mu$ be a balanced $t$-wise independent distribution
over $[q]^k$. Then the $(t+1,k)$-UGC implies that the Max $k$-CSP$_q$ problem is
NP-hard to approximate within

$$\frac{|\mathrm{Supp}(\mu)|}{q^k} + \epsilon$$

for any $\epsilon > 0$.

Thus, we have reduced the problem of obtaining strong inapproximability for
Max $k$-CSP$_q$ to the problem of finding small $t$-wise independent distributions.
As we are mainly interested in the strongest possible results that can be obtained
by this method, our main focus will be on pairwise independence, i.e., $t = 2$.
However, let us first mention two simple corollaries for general values of $t$.

For $q = 2$, it is well-known that the binary BCH code gives a $t$-wise independent
distribution over $\{0,1\}^k$ with support size $O(k^{\lfloor t/2 \rfloor})$ [1]. In other words,
the $(t+1,k)$-UGC implies that the Max $k$-CSP problem is NP-hard to approximate
within $O(k^{\lfloor t/2 \rfloor}/2^k)$. Note in particular that the $(4,k)$-UGC suffices
to get a hardness of $O(k/2^k)$ for Max $k$-CSP, which is tight up to a constant
factor.

For $q$ a prime power and large enough so that $q \ge k$, there are $t$-wise
independent distributions over $[q]^k$ with support size $q^t$ based on evaluating a
random polynomial of degree less than $t$ over $\mathbb{F}_q$. Thus, in this setting, the $(t+1,k)$-UGC
implies a hardness factor of $q^{t-k}$ for the Max $k$-CSP$_q$ problem.
In the remainder of this section, we will focus on the details of constructions
of pairwise independence, giving hardness for Max $k$-CSP$_q$ under the $(3,k)$-UGC.

4.1 Theorems 1.2 and 1.3

The pairwise independent distributions used to give Theorems 1.2 and 1.3 are
both based on the following simple lemma, which is well-known but stated here
in a slightly more general form than usual:

Lemma 4.2. Let $R$ be a finite commutative ring, and let $u, v \in R^n$ be two
vectors over $R$ such that $u_i v_j - u_j v_i \in R^*$ for some $i, j$.$^1$ Let $X \in R^n$ be
a uniformly random vector over $R^n$ and let $\mu$ be the probability distribution
over $R^2$ of $(\langle u, X \rangle, \langle v, X \rangle) \in R^2$. Then $\mu$ is a balanced pairwise independent
distribution.

Proof. Without loss of generality, assume that $i = 1$ and $j = 2$. It suffices to
prove that, for all $(a, b) \in R^2$ and any choice of values of $X_3, \ldots, X_n$, we have

$$\Pr[(\langle u, X \rangle, \langle v, X \rangle) = (a, b) \mid X_3, \ldots, X_n] = 1/|R|^2.$$

For this to be true, we need that the system

$$\begin{cases} u_1 X_1 + u_2 X_2 = a' \\ v_1 X_1 + v_2 X_2 = b' \end{cases}$$

has exactly one solution, where $a' = a - \sum_{i=3}^{n} u_i X_i$ and similarly for $b'$. This
in turn follows directly from the condition on $u$ and $v$. □

Consequently, given a set of $m$ vectors in $R^n$ such that any pair of them
satisfies the condition of Lemma 4.2, we can construct a pairwise independent
distribution over $R^m$ with support size at most $|R|^n$.

Let us now prove Theorem 1.2.

Proof of Theorem 1.2. Let $r = \lceil \log_2 (k+1) \rceil$. For a nonempty $S \subseteq [r]$, let $u_S \in \mathbb{Z}_q^r$
be the characteristic vector of $S$, i.e., $u_{S,i} = 1$ if $i \in S$, and 0 otherwise. Then,
for any $S \ne T$, the vectors $u_S$ and $u_T$ satisfy the condition of Lemma 4.2, and
thus, we have that $(\langle u_S, X \rangle)_{S \subseteq [r]}$ for a uniformly random $X \in \mathbb{Z}_q^r$ induces a
balanced pairwise independent distribution over $\mathbb{Z}_q^{2^r - 1}$ (and hence, by restricting
to any $k$ coordinates, over $\mathbb{Z}_q^k$), with support size $q^r$.

When $k = 2^r - 1$ we get a hardness of $q^{\log_2 (k+1) - k}$, but for general values of
$k$, in particular $k = 2^{r-1}$, we may lose up to a factor $q$. □
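
The construction in this proof is easy to check numerically for small parameters. The Python sketch below is an illustration only (the helper names are ours); it builds the distribution of $(\langle u_S, X \rangle)_S$ over $\mathbb{Z}_q$ by enumerating all $X$, then verifies pairwise independence and the support size. The example uses $q = 4$, which is not a field, to emphasize that Lemma 4.2 only needs a commutative ring.

```python
from fractions import Fraction
from itertools import combinations, product


def subset_vectors(r):
    """Characteristic vectors of the nonempty subsets of {0, ..., r-1}."""
    return [u for u in product((0, 1), repeat=r) if any(u)]


def induced_distribution(vectors, q):
    """Distribution of (<u, X> mod q) over all u in `vectors`, for X uniform in Z_q^r."""
    r = len(vectors[0])
    weight = Fraction(1, q ** r)
    mu = {}
    for X in product(range(q), repeat=r):
        point = tuple(sum(a * x for a, x in zip(u, X)) % q for u in vectors)
        mu[point] = mu.get(point, Fraction(0)) + weight
    return mu


def is_balanced_pairwise_independent(mu, q):
    """Check Pr[x_i = a, x_j = b] = 1/q^2 for all i != j and all a, b."""
    k = len(next(iter(mu)))
    target = Fraction(1, q * q)
    for i, j in combinations(range(k), 2):
        for a, b in product(range(q), repeat=2):
            mass = sum(p for x, p in mu.items() if x[i] == a and x[j] == b)
            if mass != target:
                return False
    return True


# Example: r = 3 gives k = 2^3 - 1 = 7 coordinates over Z_4.
q, r = 4, 3
mu = induced_distribution(subset_vectors(r), q)
print(len(mu) == q ** r, is_balanced_pairwise_independent(mu, q))
```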

We remark that for $q = 2$ this construction gives exactly the predicate used
by Samorodnitsky and Trevisan [21], giving an inapproximability of $2k/2^k$ for
all $k$, and $(k+1)/2^k$ for all $k$ of the form $2^l - 1$.

Intuitively, it should be clear that when we have more structure on $R$ in
Lemma 4.2, we should be able to find a larger collection of vectors where every
pair satisfies the "independence condition". This intuition leads us to
Theorem 1.3, dealing with the special case of Theorem 1.2 in which $q$ is a prime
power. The construction of Theorem 1.3 is essentially the same as that of [17].

$^1$ $R^*$ denotes the set of units of $R$. In the case that $R$ is a field, the condition is equivalent
to saying that $u$ and $v$ are linearly independent.
Proof of Theorem 1.3. Let $r = \lceil \log_q (k(q-1) + 1) \rceil$, and $n = (q^r - 1)/(q - 1) \ge k$.

Let $P(\mathbb{F}_q^r)$ be the projective space over $\mathbb{F}_q^r$, i.e., $P(\mathbb{F}_q^r) = (\mathbb{F}_q^r \setminus 0)/\sim$. Here $\sim$
is the equivalence relation defined by $(x_1, \ldots, x_r) \sim (y_1, \ldots, y_r)$ if there exists
a $c \in \mathbb{F}_q^*$ such that $x_i = c y_i$ for all $i$, i.e., if $(x_1, \ldots, x_r)$ and $(y_1, \ldots, y_r)$ are
linearly dependent. We then have $|P(\mathbb{F}_q^r)| = (q^r - 1)/(q - 1) = n$.

Choose $n$ vectors $u_1, \ldots, u_n \in \mathbb{F}_q^r$ as representatives from each of the equivalence
classes of $P(\mathbb{F}_q^r)$. Then any pair $u_i, u_j$ satisfies the condition of Lemma 4.2,
and as in Theorem 1.2, we get a balanced pairwise independent distribution
over $\mathbb{F}_q^n$, with support size $q^r$.

When $k = (q^r - 1)/(q - 1)$, this gives a support size of $q^r = k(q - 1) + 1$ and hence
a hardness of $(k(q - 1) + 1)/q^k$, and for general $k$, in particular
$k = (q^{r-1} - 1)/(q - 1) + 1$, we lose a factor $q$ in the hardness ratio. □
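
The projective-space construction can be checked in the same way. The Python sketch below is an illustration restricted to prime $q$, an assumption made here so that arithmetic modulo $q$ is field arithmetic and no implementation of general finite fields is needed; the helper names are ours. It picks the representative of each projective point whose first nonzero coordinate is 1.

```python
from fractions import Fraction
from itertools import combinations, product


def projective_representatives(p, r):
    """One vector per point of P(F_p^r): nonzero vectors whose first nonzero entry is 1."""
    reps = []
    for v in product(range(p), repeat=r):
        first = next((a for a in v if a != 0), None)
        if first == 1:
            reps.append(v)
    return reps


def induced_distribution(vectors, p):
    """Distribution of (<u, X> mod p) over all u in `vectors`, for X uniform in F_p^r."""
    r = len(vectors[0])
    weight = Fraction(1, p ** r)
    mu = {}
    for X in product(range(p), repeat=r):
        point = tuple(sum(a * x for a, x in zip(u, X)) % p for u in vectors)
        mu[point] = mu.get(point, Fraction(0)) + weight
    return mu


def is_balanced_pairwise_independent(mu, q):
    """Check Pr[x_i = a, x_j = b] = 1/q^2 for all i != j and all a, b."""
    k = len(next(iter(mu)))
    target = Fraction(1, q * q)
    for i, j in combinations(range(k), 2):
        for a, b in product(range(q), repeat=2):
            mass = sum(p for x, p in mu.items() if x[i] == a and x[j] == b)
            if mass != target:
                return False
    return True


# Example: p = 3, r = 3 gives k = (27 - 1) / 2 = 13 and support size 27 = k(q - 1) + 1.
p, r = 3, 3
reps = projective_representatives(p, r)
mu = induced_distribution(reps, p)
print(len(reps), len(mu), is_balanced_pairwise_independent(mu, p))
```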

Again, for $q = 2$, this construction gives the same predicate used by
Samorodnitsky and Trevisan. In the case that $q \ge k$, we get a hardness of $q^2/q^k$, the
same factor as we get from the general construction for $t$-wise independence
mentioned at the beginning of this section.



4.2 Theorem 1.4

Let us now look closer at the special case of boolean variables, i.e., q = 2. So 
far, we have only given a different proof of Samorodnitsky and Trevisan's result, 
but we will now show how to improve this. 

A Hadamard matrix is an $n \times n$ matrix $H$ over $\pm 1$ such that $HH^T = nI$, i.e.,
each pair of rows, and each pair of columns, are orthogonal. Let $h(n)$ denote
the smallest $n' \ge n$ such that there exists an $n' \times n'$ Hadamard matrix. It is a
well-known fact that Hadamard matrices give small pairwise independent
distributions and thus give hardness of approximating Max $k$-CSP. To be specific,
we have the following proposition:

Proposition 4.3. For every $k \ge 3$, the $(3,k)$-UGC implies that the Max $k$-CSP
problem is NP-hard to approximate within $h(k+1)/2^k + \epsilon$.

Proof. Let $n = h(k+1)$ and let $A$ be an $n \times n$ Hadamard matrix, normalized so
that one column contains only ones. Remove $n - k$ of the columns, including the
all-ones column, and let $A'$ be the resulting $n \times k$ matrix. Let $\mu : \{-1, 1\}^k \to
[0, 1]$ be the probability distribution which assigns probability $1/n$ to each row
of $A'$. Then $\mu$ is a balanced pairwise independent distribution with $|\mathrm{Supp}(\mu)| =
h(k+1)$. □
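
The proof can be illustrated with Sylvester's construction, which yields Hadamard matrices whose order is a power of 2 and whose first column is all ones. The Python sketch below is an illustration under that assumption (so it covers only those $k$ for which $h(k+1)$ is a power of 2, such as $k = 3$ and $k = 6$); the helper names are ours.

```python
from fractions import Fraction
from itertools import combinations, product


def sylvester(m):
    """Sylvester Hadamard matrix of order 2^m; its first column is all ones."""
    H = [[1]]
    for _ in range(m):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def distribution_from_hadamard(H, k):
    """Drop the all-ones first column, keep the next k columns, put mass 1/n on each row."""
    n = len(H)
    if k > n - 1 or any(row[0] != 1 for row in H):
        raise ValueError("need a normalized Hadamard matrix of order at least k + 1")
    mu = {}
    for row in H:
        point = tuple(row[1:k + 1])
        mu[point] = mu.get(point, Fraction(0)) + Fraction(1, n)
    return mu


def is_balanced_pairwise_independent(mu):
    """Check Pr[x_i = a, x_j = b] = 1/4 for all i != j and all a, b in {-1, 1}."""
    k = len(next(iter(mu)))
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            mass = sum(p for x, p in mu.items() if x[i] == a and x[j] == b)
            if mass != Fraction(1, 4):
                return False
    return True


# k = 3 uses the 4 x 4 matrix; k = 6 uses the 8 x 8 matrix.
for k, m in ((3, 2), (6, 3)):
    mu = distribution_from_hadamard(sylvester(m), k)
    print(k, len(mu), is_balanced_pairwise_independent(mu))
```

For $k = 3$ the support has 4 points, so the resulting ratio $4/2^3 = 1/2$ agrees with the 3-XOR bound from the introduction.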

It is well known that Hadamard matrices can only exist for $n = 1$, $n = 2$,
and $n \equiv 0 \pmod 4$. The famous Hadamard Conjecture asserts that Hadamard
matrices exist for all $n$ which are divisible by 4, in other words, that $h(n) =
4 \lceil n/4 \rceil \le n + 3$. It is also possible to get useful unconditional bounds on $h(n)$.
We now give one such easy bound.

Theorem 4.4 ([19]). For every odd prime $p$ and integers $e, f \ge 0$, there exists
an $n \times n$ Hadamard matrix $H_n$ where $n = 2^e (p^f + 1)$, whenever this number is
divisible by 4.

Theorem 4.5 ([4]). There exists an integer $n_0$ such that for every $n \ge n_0$,
there is a prime $p$ between $n$ and $n + n^{0.525}$.
Corollary 4.6. We have: $h(n) \le n + O(n^{0.525})$.

Proof. Let $p$ be the smallest prime larger than $n/2$, and let $n' = 2(p + 1) \ge n$.
Then, Theorem 4.4 asserts that there exists an $n' \times n'$ Hadamard matrix, so
$h(n) \le n'$. If $n$ is sufficiently large ($n \ge 2n_0$), then by Theorem 4.5, $p \le
n/2 + (n/2)^{0.525}$ and $n' \le n + 2n^{0.525}$, as desired. □

Theorem 1.4 follows from Proposition 4.3 and Corollary 4.6.

It is probably possible to get a stronger unconditional bound on $h(n)$ than
the one given by Corollary 4.6, by using stronger construction techniques than
the one of Theorem 4.4.
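
The quantity $n' = 2(p+1)$ used in this proof is simple to compute. The Python sketch below is an illustration only (the function names are ours, and trial division is used for primality, which is adequate only for small inputs). It returns the order guaranteed by Theorem 4.4 with $e = f = 1$, which is an upper bound on $h(n)$ and not necessarily its exact value.

```python
def is_prime(n):
    """Primality by trial division; adequate for the small inputs used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def hadamard_order_bound(n):
    """Return n' = 2(p + 1), where p is the smallest odd prime larger than n/2."""
    p = n // 2 + 1
    while p == 2 or not is_prime(p):
        p += 1
    return 2 * (p + 1)


# For n = 100 the smallest prime above 50 is 53, giving n' = 108.
for n in (12, 100, 1000):
    print(n, hadamard_order_bound(n))
```

Since $p$ is odd, $2(p+1)$ is always divisible by 4, so the divisibility condition of Theorem 4.4 holds automatically.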

5 Discussion 

We have given a strong sufficient condition for predicates to be hereditary
approximation resistant under (a weakened version of) the Unique Games
Conjecture: it suffices for the set of satisfying assignments to contain a balanced
pairwise independent distribution. Using constructions of small such distributions,
we were then able to construct approximation resistant predicates with
few accepting inputs, which in turn gave improved hardness for the Max
$k$-CSP$_q$ problem.

There are several aspects here where there is room for interesting further 
work: 

As mentioned earlier, we do not know whether the $(t,k)$-UGC implies the
"standard" UGC for large values of $t$. In particular, proving the $(t,k)$-UGC for
some $t < \sqrt{k}/\log k$ would give hardness for Max $k$-CSP better than the best
current NP-hardness result. But even understanding the $(k,k)$-UGC seems like
an interesting question.

A very natural and interesting question is whether our condition is also nec- 
essary for a predicate to be hereditary approximation resistant, i.e., if pairwise 
independence gives a complete characterization of hereditary approximation re- 
sistance. 

Finally, it is natural to ask whether our results for Max $k$-CSP$_q$ can be
pushed a bit further, or whether they are tight. For the case of boolean variables,
Hast [9] proved that any predicate accepting at most $2 \lfloor k/2 \rfloor + 1$ inputs is not
approximation resistant. For $k \equiv 2, 3 \pmod 4$ this exactly matches the result
we get under the UGC and the Hadamard Conjecture (which for $k = 2^r - 1$ and
$k = 2^r - 2$ is the same hardness as [21]). For $k \equiv 0, 1 \pmod 4$, we get a gap of
2 between how few satisfying assignments an approximation resistant predicate
can and cannot have.

Thus, the hitherto very successful approach of obtaining hardness for Max
$k$-CSP by finding "small" approximation resistant predicates cannot be taken
further, but there is still a small constant gap of roughly $1/0.44$ to the best current
algorithm. It would be interesting to know whether the algorithm can be
improved, or whether the hardest instances of Max $k$-CSP are not Max CSP($P$)
instances for some approximation resistant $P$.

For larger $q$, this situation becomes a lot worse. When $q = 2^l$ and $k =
(q^r - 1)/(q - 1)$, we have a gap of $\Theta(q/\log_2 q)$ between the best algorithm and
the best inapproximability, and for general values of $q$ and $k$, the gap is even
larger.
A Good labellings from influential variables

Lemma A.1. Let $X$ be a $k$-ary Unique Label Cover instance. Furthermore, for
each vertex $v$, let $f_v : [q]^L \to [q]$ and define

$$g_{v,a}(x) = \begin{cases} 1 & \text{if } f_v(x) = a \\ 0 & \text{otherwise} \end{cases}$$

Then if there is a fraction of at least $\epsilon$ edges $e = (v_1, \ldots, v_k)$ with a vector
$a \in [q]^k$, an index $i \in [L]$ and a set $J \subseteq [k]$ of $|J| = t$ indices such that

$$\mathrm{Inf}_{\pi_j^{-1}(i)}^{\le d}(g_{v_j,a_j}) \ge \tau \qquad (3)$$

for all $j \in J$, then $\mathrm{Opt}_t(X) \ge \delta := \epsilon \left( \frac{\tau}{d \cdot q} \right)^t$.

Proof. For each $v \in V$, let

$$C(v) = \{ i \mid \mathrm{Inf}_i^{\le d}(g_{v,a}) \ge \tau \text{ for some } a \in [q] \}.$$

Note that $|C(v)| \le q \cdot d/\tau$.

Define a labelling $\ell : V \to [L]$ by picking, for each $v \in V$, a label $\ell(v)$
uniformly at random from $C(v)$ (or an arbitrary label in case $C(v)$ is empty). Let
$e = (v_1, \ldots, v_k)$ be an edge satisfying Equation (3). Then for all $j \in J$, $\pi_j^{-1}(i) \in
C(v_j)$, and thus, the probability that $\pi_j(\ell(v_j)) = i$ is $1/|C(v_j)|$. This implies
that the probability that this edge is $t$-wise satisfied is at least $\prod_{j \in J} 1/|C(v_j)| \ge
\left( \frac{\tau}{d \cdot q} \right)^t$. Overall, the total expected fraction of edges that are $t$-wise satisfied by
$\ell$ is at least $\delta = \epsilon \left( \frac{\tau}{d \cdot q} \right)^t$, and thus $\mathrm{Opt}_t(X) \ge \delta$. □






